Abstract — Reducing maintenance costs and preventing unscheduled machinery downtime are important goals, so knowing what faults occur, where, and how is essential. Fault detection and diagnosis play an important role in rotating machinery and in machine learning. This paper proposes a kernel-based method for diagnosing faults when they occur. The kernel is built from the wavelet packet transform with association rule mining, and information fusion provides the decision rule. The kernel is designed to shorten detection time and to reduce misclassification. The proposed data fusion strategy uses a support vector machine with a multi-kernel wavelet-entropy formulation, which finds the optimal hyperplane with maximal margin.

Keywords — Fault Diagnosis, Wavelet Entropy, Information 
Fusion, kernel method. 

I. Introduction 
Numerous theoretical and empirical studies have shown that such methods are effective in improving classification performance for various application problems. The failure of machinery reduces the production rate and increases the costs of production and maintenance [1]. It is therefore important to reduce noise and unexpected events in machine learning, so knowledge of how faults occur is very important.
In pattern recognition, the kernel method is a discriminant-based classification approach related to linear discriminant analysis (LDA), which assumes that the class-conditional probability follows a Gaussian distribution. For large data sets, selecting the best kernel is an important task.

In this paper we propose a new model for fault diagnosis. The method consists of three steps for accurate fault diagnosis based on the kernel method with the best kernel position.
The first step is feature extraction based on the wavelet packet transform with associated entropy. The input data are converted to a signal model (feature map) by the wavelet, the data are then decomposed with the wavelet packet tree, and finally the data with maximum energy entropy are selected.

The second step builds a Mercer kernel with the Morlet mother wavelet on the extracted data for classification.
The third step fuses the data by kernel fusion and selects the best kernel among the fused kernels.

The proposed data fusion strategy uses a support vector machine with a multi-kernel wavelet-entropy formulation, which finds the optimal hyperplane with maximal margin [2]. In the distributed scheme, the individual data sources are processed separately and modeled using the support vector machine [3]. Fault diagnosis aims to detect, isolate, and assess faults and failures of the engine system and its major components.

II. Material and Methodology 
Fault diagnosis plays an important role in pattern recognition. A pattern is a set of objects, processes or events that consists of both deterministic and stochastic components [4]. Recognition is the identification of a pattern as a member of a category that is either already known (classification) or to be learned (clustering) [5]. Pattern recognition therefore has two parts: the pattern part builds a category or class of patterns, and the recognition part makes a decision about the category or class of a given pattern (Figure 1).
Figure 1: Fault diagnosis as pattern recognition

The new method for fault diagnosis has three steps:

I. Feature extraction using the wavelet packet transform with association rule mining

II. Kernel method classification with a wavelet kernel

III. Fault decision using information fusion by feature-level fusion (kernel fusion)
I. Feature extraction based on wavelet packet transform (WPT) with association rule mining
Feature extraction combines attributes into a new, reduced set of features. In pattern recognition and image processing, feature extraction reduces the dimension of the feature space in order to improve classification. The wavelet transform is more powerful than many other transforms because it analyzes the signal in both the time and frequency domains.

Selecting a suitable wavelet transform for a given application is important; the wavelet packet transform (WPT) is better suited to capturing time-frequency characteristics.

Association rule mining is a method for detecting the strongest relations between variables in large data sets. One of the quantitative measures associated with the wavelet packet transform (WPT) is entropy, which can be attached to the WPT through a mathematical rule. Entropy provides valuable information for analyzing non-stationary signals. Many wavelet entropies have been proposed to express signal characteristics; they are based on different algorithms, so they carry different meanings in applications. Wavelet energy entropy is a statistical analysis of signal energy over frequency bands and describes the complexity of the distribution of signal energy in the frequency domain. In this paper, wavelet energy entropy is used to obtain energy distribution information, which is useful for the decision rule in information fusion.
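As an illustration of wavelet energy entropy, the following minimal sketch decomposes a signal with a full Haar wavelet packet tree and computes the Shannon entropy of the relative energy in the terminal nodes. The function names, the choice of natural logarithm, and the requirement that the signal length be divisible by 2^level are assumptions made for this sketch; it is not necessarily the exact procedure used in the experiments.

```python
import numpy as np

def haar_wavelet_packet(signal, level):
    """Full Haar wavelet packet tree; returns the 2**level terminal nodes."""
    nodes = [np.asarray(signal, dtype=float)]
    for _ in range(level):
        next_nodes = []
        for node in nodes:
            even, odd = node[0::2], node[1::2]
            next_nodes.append((even + odd) / np.sqrt(2.0))  # approximation branch
            next_nodes.append((even - odd) / np.sqrt(2.0))  # detail branch
        nodes = next_nodes
    return nodes

def wavelet_energy_entropy(signal, level):
    """Shannon entropy of the relative energy held by each terminal node."""
    nodes = haar_wavelet_packet(signal, level)
    energy = np.array([np.sum(node ** 2) for node in nodes])
    p = energy / energy.sum()       # assumes the signal is not identically zero
    p = p[p > 0]                    # treat 0 * log(0) as 0
    return float(-np.sum(p * np.log(p)))
```

A signal whose energy sits in a single frequency band gives an entropy near zero, while energy spread evenly over all 2^level bands gives the maximum value log(2^level).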
II. Kernel method
The second step in fault diagnosis is classification. The support vector machine (SVM) avoids the local-minimum problems of classical learning methods and has been applied successfully to many classification problems [6].

We assume a training set of $N$ data points $\{x_k, y_k\}_{k=1}^{N}$, where $x_k \in \mathbb{R}^n$ is the input data and $y_k \in \mathbb{R}$ is the $k$-th output. The SVM constructs a decision function of the form:

$$y(x) = w^T x + b \qquad (2\text{-}1)$$

For function estimation with the SVM, the following optimization problem can be given [7]:

$$\min_{w,b,e} J(w, b, e) = \frac{1}{2} w^T w + c\,\frac{1}{2} \sum_{k=1}^{N} e_k^2$$

$$\text{s.t.}\quad y_k = w^T x_k + b + e_k, \quad k = 1, \dots, N \qquad (2\text{-}2)$$

where $e_k$ are the slack variables and $c$ is a positive real constant.

One defines the Lagrangian:

$$L(w, b, e; \alpha) = J(w, b, e) - \sum_{k=1}^{N} \alpha_k \left( w^T x_k + b + e_k - y_k \right) \qquad (2\text{-}3)$$

With Lagrange multipliers $\alpha_k$, the conditions for optimality give:

$$w = \sum_{k=1}^{N} \alpha_k x_k, \qquad \sum_{k=1}^{N} \alpha_k = 0, \qquad \alpha_k = c\, e_k \qquad (2\text{-}5)$$

With $\alpha_k = c\, e_k$, the support values $\alpha_k$ are now proportional to the errors at the data points.
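The least-squares formulation above reduces training to a single linear system. A minimal sketch follows, assuming the linear model $w^T x + b$ of this section and the standard elimination of $w$ and $e_k$ from the optimality conditions; the function name `lssvm_fit` and the default value of `c` are hypothetical.

```python
import numpy as np

def lssvm_fit(X, y, c=10.0):
    """Solve the least-squares SVM dual system for (w, b, alpha), linear kernel."""
    n = X.shape[0]
    omega = X @ X.T                        # Gram matrix with entries x_k^T x_l
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0                         # encodes sum_k alpha_k = 0
    A[1:, 0] = 1.0                         # multiplies the offset b
    A[1:, 1:] = omega + np.eye(n) / c      # e_k = alpha_k / c adds I / c
    rhs = np.concatenate(([0.0], y))
    sol = np.linalg.solve(A, rhs)
    b, alpha = sol[0], sol[1:]
    w = X.T @ alpha                        # w = sum_k alpha_k x_k
    return w, b, alpha
```

After fitting, the residual `y - (X @ w + b)` equals `alpha / c`, which is the sense in which the support values are proportional to the errors at the data points.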
Based on the generalized linear discriminant function, the input space can be mapped into a higher-dimensional feature space by a nonlinear transform to solve nonlinear problems, and the optimal or generalized optimal classification plane can be evaluated in the feature space [6].

The kernel function can be expressed as follows:
$$K(x, x') = \phi(x) \cdot \phi(x') \qquad (2\text{-}6)$$

Then, the decision function of the support vector machine can be obtained as follows:

$$f(x) = \mathrm{sign}\left( \sum_{i=1}^{n} \alpha_i y_i K(x_i, x) + b \right) \qquad (2\text{-}7)$$

Because the support vector machine cannot give a probabilistic interpretation of the decision process, the relevance vector machine (RVM) is applied to fault diagnosis. The likelihood of the training data set is obtained by applying the generalized linear model and the logistic sigmoid function:

$$P(t \mid w) = \prod_{n=1}^{N} \sigma\{y(x_n; w)\}^{t_n} \left[ 1 - \sigma\{y(x_n; w)\} \right]^{1 - t_n} \qquad (2\text{-}8)$$

where $t_n \in \{0, 1\}$ is the class label of sample $x_n$ and

$$\sigma(y) = \frac{1}{1 + e^{-y}}$$
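The Bernoulli likelihood with a logistic sigmoid link, as used by the relevance vector machine for classification, can be sketched as follows. The function names and the 0/1 label encoding are assumptions of this sketch, and the log form is used only for numerical convenience.

```python
import numpy as np

def sigmoid(y):
    """Logistic sigmoid: maps a real-valued model output to a probability."""
    return 1.0 / (1.0 + np.exp(-y))

def log_likelihood(t, y_lin):
    """Sum over samples of t_n*log(s_n) + (1 - t_n)*log(1 - s_n), s_n = sigmoid(y_n)."""
    s = sigmoid(y_lin)
    return float(np.sum(t * np.log(s) + (1.0 - t) * np.log(1.0 - s)))
```

Model outputs that agree in sign with the labels give a higher log-likelihood than outputs that disagree, which is what the training procedure maximizes.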



In this study, the multiple-kernel wavelet relevance vector machine classifier is expressed as follows.

The optimal offset $b$ is obtained from a double sum over the training samples ($i = 1, \dots, n$) and the kernels (index $s$). The decision function of the multiple-kernel RVM-based classifier then takes the form:

$$f(x) = \mathrm{sign}\left[ \sum_{i=1}^{n} \sum_{s} w_{is}\, K_s(x_i, x) + b \right] \qquad (2\text{-}11)$$

where $K_s$ is the $s$-th wavelet kernel and $w_{is}$ is the weight of kernel $s$ at training sample $x_i$.

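The wavelet kernel function is described as follows:

$$K(x, x') = \prod_{i=1}^{l} \psi\!\left( \frac{x_i - x'_i}{a} \right) \qquad (2\text{-}12)$$

For this purpose, the mother wavelet used is:

$$\psi(x) = \cos(\omega_0 x)\, \exp\!\left( -\frac{x^2}{2} \right) \qquad (2\text{-}13)$$

This is a Mercer kernel built on the Morlet mother wavelet, where $\omega_0$ is the center frequency of the wavelet and $a$ is the scale factor. The kernel function is therefore:

$$K(x, x') = \prod_{i=1}^{l} \cos\!\left( \omega_0 \frac{x_i - x'_i}{a} \right) \exp\!\left( -\frac{(x_i - x'_i)^2}{2a^2} \right) \qquad (2\text{-}14)$$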
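The Morlet wavelet kernel can be evaluated for whole data sets at once. The sketch below computes the Gram matrix as a product over coordinates of a cosine term and a Gaussian envelope. The function name `morlet_kernel` is hypothetical, and the default center frequency `omega0 = 1.75` is an assumption taken from common practice in the wavelet-kernel literature, not a value confirmed by this paper.

```python
import numpy as np

def morlet_kernel(X, Z, a=1.0, omega0=1.75):
    """Gram matrix K[p, q] = prod_i cos(omega0 * d_i) * exp(-d_i**2 / 2),
    with d = (X[p] - Z[q]) / a taken coordinate by coordinate."""
    d = (X[:, None, :] - Z[None, :, :]) / a
    return np.prod(np.cos(omega0 * d) * np.exp(-0.5 * d ** 2), axis=2)
```

For any sample set `X`, `morlet_kernel(X, X)` is symmetric with ones on the diagonal, and its eigenvalues are non-negative up to rounding error, which is the Mercer condition in matrix form.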



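The wavelet support vector machine (WSVM) and the standard support vector machine have basically the same configuration; the difference between them is the kernel function.

The kernel function of the wavelet support vector machine is therefore:

$$K(x, x') = \prod_{i=1}^{l} \psi\!\left( \frac{x_i - b_i}{a} \right) \psi\!\left( \frac{x'_i - b'_i}{a} \right)$$

where $\psi$ is the Morlet mother wavelet, $a$ is the scale factor, and $b_i$, $b'_i$ are the components of the shift vectors.

Thus, the decision function for classification is:

$$f(x) = \mathrm{sign}\left( \sum_{j=1}^{m} \alpha_j y_j K(x_j, x) + b \right) \qquad (2\text{-}15)$$

Therefore:

$$f(x) = \mathrm{sign}\left( \sum_{j=1}^{m} \alpha_j y_j \prod_{i=1}^{l} \cos\!\left( \omega_0 \frac{x_{j,i} - x_i}{a} \right) \exp\!\left( -\frac{(x_{j,i} - x_i)^2}{2a^2} \right) + b \right) \qquad (2\text{-}16)$$

where $m$ is the number of training samples, $x_j$ is the $j$-th training sample, $y_j$ is the training target of the $j$-th training sample, $x$ is the test sample, $\omega_0$ is the center frequency of the Morlet wavelet, and the scalar $b$ is the offset.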
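To make the decision function $f(x) = \mathrm{sign}(\sum_j \alpha_j y_j K(x_j, x) + b)$ concrete, the sketch below trains a two-class wavelet SVM and evaluates it on test samples. It assumes a least-squares SVM classifier, so that training is one linear solve; this is a simplification chosen for brevity and is not claimed to be the solver used in this work. The function names and the parameters `c`, `a` and `omega0 = 1.75` are hypothetical.

```python
import numpy as np

def morlet_gram(X, Z, a=1.0, omega0=1.75):
    """Morlet wavelet kernel matrix between the rows of X and the rows of Z."""
    d = (X[:, None, :] - Z[None, :, :]) / a
    return np.prod(np.cos(omega0 * d) * np.exp(-0.5 * d ** 2), axis=2)

def train_wsvm(X, y, c=10.0, a=1.0):
    """Least-squares SVM classifier (labels in {-1, +1}); returns (alpha, b)."""
    m = len(y)
    omega = np.outer(y, y) * morlet_gram(X, X, a)
    A = np.zeros((m + 1, m + 1))
    A[0, 1:] = y                           # encodes sum_j alpha_j y_j = 0
    A[1:, 0] = y
    A[1:, 1:] = omega + np.eye(m) / c
    sol = np.linalg.solve(A, np.concatenate(([0.0], np.ones(m))))
    return sol[1:], sol[0]

def decide(x_test, X, y, alpha, b, a=1.0):
    """Evaluate sign(sum_j alpha_j y_j K(x_j, x) + b) for one or more test samples."""
    K = morlet_gram(np.atleast_2d(x_test), X, a)
    return np.sign(K @ (alpha * y) + b)
```

Only the kernel distinguishes this from a standard SVM: replacing `morlet_gram` with another Mercer kernel leaves the rest of the code unchanged.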

III. Fault decision using information fusion by feature-level fusion (kernel fusion)

Information fusion based on wavelet entropy applies the wavelet packet transform to prepare the data for fusion and decomposes them into different resolution spaces. The activity measure captures feature information from the multi-resolution analysis coefficients of the input image and determines which image carries the more salient feature information. The general activity measure is a function of the amplitude of the detail components [6]. It is defined as:
$$a_j^s(m, n) = \sum_{m',\, n'} p(m', n')\, \left| D_j^s(m + m',\, n + n') \right|^{k} \qquad (2\text{-}17)$$

where $D_j^s$ is the detail component coefficient matrix, $a_j^s$ is the activity measure of $D_j^s$, and $p$ is the mask of the window area, which is used to filter $D_j^s$ linearly. The activity measures above are calculated from the detail components of every decomposition level and ignore the impact of the corresponding approximation components. The entropy activity measure therefore takes the impact of both the detail and the approximation components into account, which improves the fusion result [7]. Suppose $p$ is the window mask of the $j$-th level's detail component (2-18). For every window, the $j$-th level approximation coefficient matrix is taken (2-19) and every point in it is normalized. The window entropy is then calculated from the normalized coefficients (2-20).



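Then, for the $j$-th level detail component of the input, the basic decision-making module adopted in the information fusion algorithm is:

$$\mu_j^{s}(m,n) = \begin{cases} 1, & M_j^{s,AB}(m,n) < \sigma \text{ and } a_j^{s,A}(m,n) > a_j^{s,B}(m,n) \\ 0, & M_j^{s,AB}(m,n) < \sigma \text{ and } a_j^{s,A}(m,n) \le a_j^{s,B}(m,n) \\ \frac{1}{2} - \frac{1}{2}\,\frac{1 - M_j^{s,AB}(m,n)}{1 - \sigma}, & M_j^{s,AB}(m,n) \ge \sigma \text{ and } a_j^{s,A}(m,n) \le a_j^{s,B}(m,n) \\ \frac{1}{2} + \frac{1}{2}\,\frac{1 - M_j^{s,AB}(m,n)}{1 - \sigma}, & M_j^{s,AB}(m,n) \ge \sigma \text{ and } a_j^{s,A}(m,n) > a_j^{s,B}(m,n) \end{cases}$$

The superscript $s$ is the direction of the detail component, $\mu_j^s$ is the decision factor of the fusion algorithm (the weight given to input $A$), $a_j^{s,A}$ and $a_j^{s,B}$ are the entropy activity measures described above, and $\sigma$ is the relativity threshold value. $M_j^{s,AB}(m,n)$ is the relative coefficient of the input images $A$ and $B$:

$$M_j^{s,AB}(m,n) = \frac{2 \sum_{m'} \sum_{n'} D_j^{s,A}(m+m',\, n+n')\, D_j^{s,B}(m+m',\, n+n')}{\sum_{m'} \sum_{n'} \left( \left| D_j^{s,A}(m+m',\, n+n') \right|^2 + \left| D_j^{s,B}(m+m',\, n+n') \right|^2 \right)} \qquad (2\text{-}21)$$

The wavelet coefficient after fusion can be written as:

$$D_j^{s,F}(m,n) = \sum_{I \in \{A, B\}} \mu_j^{s,I}(m,n)\, D_j^{s,I}(m,n) \qquad (2\text{-}22)$$

In this formula, $\mu_j^{s,A} + \mu_j^{s,B} = 1$ and $D_j^{s}$ is the detail component coefficient on level $j$ in direction $s$ [7].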
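A fusion rule of this kind can be sketched in a few lines. The sketch below assumes a windowed energy activity measure (exponent $k = 2$), a normalized match measure in $[-1, 1]$, and the usual select-or-average rule: when the two inputs disagree (match below the threshold) the coefficient with the larger activity is selected, otherwise the two are averaged with weights that depend on the match. The function names, the zero padding at the borders, the small constant in the denominator and the default `sigma = 0.75` are all assumptions of this sketch.

```python
import numpy as np

def window_sum(A, p):
    """Weighted sum of A over the window mask p around each point (zero padding)."""
    r, c = p.shape[0] // 2, p.shape[1] // 2
    padded = np.pad(A, ((r, r), (c, c)))
    out = np.zeros_like(A, dtype=float)
    for i in range(p.shape[0]):
        for j in range(p.shape[1]):
            out += p[i, j] * padded[i:i + A.shape[0], j:j + A.shape[1]]
    return out

def fuse(DA, DB, p, sigma=0.75):
    """Fuse two detail-coefficient matrices of equal shape."""
    aA = window_sum(DA ** 2, p)                      # activity of input A
    aB = window_sum(DB ** 2, p)                      # activity of input B
    match = 2.0 * window_sum(DA * DB, p) / (aA + aB + 1e-12)
    w_min = np.where(match < sigma, 0.0,
                     0.5 - 0.5 * (1.0 - match) / (1.0 - sigma))
    w_max = 1.0 - w_min
    mu = np.where(aA > aB, w_max, w_min)             # weight given to input A
    return mu * DA + (1.0 - mu) * DB
```

With a uniform mask such as `np.ones((3, 3)) / 9.0`, identical inputs are returned unchanged, and when one input is zero everywhere the other is passed through.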
The Morlet wavelet kernel is not only orthogonal under translation but can also approximate an arbitrary function in the space of square-integrable functions, such as the classification function $f(x)$. Since the Morlet wavelet kernel has nonlinear mapping ability, the Morlet wavelet SVM (MWSVM) has good adaptive classification and decision-making ability. However, practical classification problems usually require multiclass classification. There are three approaches to creating a multiclass SVM (MSVM) by training and combining several binary SVM classifiers: one-against-all, one-against-one, and DAGSVM [5]. The earliest implementation used for MSVM classification is probably the one-against-all method [5]. The MSVM using the one-against-all strategy can be constructed by applying the following procedure [6]:



1- Construct $k$ binary SVM classifiers $f_1, \dots, f_k$, where $f_i$ ($i = 1, \dots, k$) separates the training data of class $i$ from the other training data:

$$\mathrm{sgn}[f_i(x)] = \begin{cases} +1, & \text{if instance } x \text{ belongs to class } i \\ -1, & \text{otherwise.} \end{cases}$$

2- Construct the $k$-class MSVM classifier by choosing the class corresponding to the maximal value of the functions $f_i(x)$. The decision function is:

$$\delta(x) = \arg\max \{ f_1(x), \dots, f_k(x) \}$$

Thus, the final decision strategy is:

$$d(x) = \arg\max \{ v_1, \dots, v_N \}$$

where $d(x)$ is the final decision function, $v_i$ ($i = 1, 2, \dots, D$, with $D$ taken from the derivation tree) is the number of votes obtained by class $i$, and $\delta_j$ is the output of the $j$-th MSVM trained on the $j$-th data source.
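The one-against-all decision and the vote-based fusion across data sources can be sketched as follows. The sketch assumes that each sensor's MSVM has already produced one score $f_i(x)$ per class; the function names are hypothetical, and ties are broken in favor of the lowest class index, which is a property of `np.argmax` rather than of the method.

```python
import numpy as np

def one_against_all(scores):
    """scores[i] is f_i(x) from the i-th binary classifier; pick the largest."""
    return int(np.argmax(scores))

def fuse_votes(per_sensor_scores, n_classes):
    """Each sensor's MSVM votes for one class; the class with most votes wins."""
    votes = np.zeros(n_classes, dtype=int)
    for scores in per_sensor_scores:
        votes[one_against_all(scores)] += 1
    return int(np.argmax(votes)), votes
```

For example, three sensors with class scores `[0.2, -0.5, 0.9]`, `[0.1, 0.8, 0.3]` and `[-0.3, 0.1, 0.7]` vote for classes 2, 1 and 2, so the fused decision is class 2.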

Therefore: 

\k(x v x D ') ■» k^x^l 

D: depth of tree N: Sensors 

The scheme of the new fault diagnosis method is shown in Figure 2.
Figure 2: Scheme of the new method



This algorithm gives good results for the following reasons:

i. Detection time
Detection time is the time taken to extract the data and reach a decision. Our feature extraction method is complex wavelet packet entropy. The algorithm uses complex data for the wavelet tree and the maximum coefficient on each node for signal optimization. The wavelet tree is a fast method because the search coefficient in a B-tree is 2 with depth $D$:
$O(\text{wavelet packet entropy}) = 2^{D}$, where $D$ is the depth of the wavelet tree.

ii. Learning time:
In the kernel method we have:

1=1 „ 



mi- 



III. Results and Tables

We implemented this algorithm with input data obtained from teacher Huang of the mechanical school at WUT.

Step 1: Input data

The data come from two sensors on a gearbox machine.
Figure 3: Data input from 2 sensors

Step 2: Feature extraction based on wavelet entropy

In this step, features are extracted using the wavelet packet transform (Haar function, level 5), and the data with maximum energy entropy are selected.

For sensor 1:
1. Feature extraction based on wavelet entropy

Figure 3-1: Analysis of the input signal
Figure 3-2: Wavelet packet tree
Figure 3-3: Max energy entropy
2. Classification using the kernel method with the wavelet kernel

Figure 3-4: Kernel method

For sensor 2:

1. Feature extraction based on wavelet entropy

Figure 3-5: Analysis of the input signal
Figure 3-6: Wavelet packet tree
Figure 3-7: Max energy entropy

2. Classification using the kernel method with the wavelet kernel
Figure 3-8: Kernel method
Step 3: Information fusion (kernel fusion)
Figure 4: Fused kernel
Step 4: Fault occurrence
Figure 5: Kernel model classification
Step 5: Test data (confusion matrix)
Figure 6: Test classification
IV. Conclusion 

This algorithm for fault diagnosis combines three methods.

The first method is feature extraction using wavelet packet entropy. It uses the wavelet tree and the maximum coefficient on each node for signal optimization. The wavelet tree is a fast method because the search coefficient in a B-tree is 2 with depth $D$.

The second method is multiclass classification. For this we suggest the kernel method with a wavelet kernel (a Mercer kernel using the Morlet mother wavelet). For each input (sensor), an SVM is built with the kernel method. This method suits the learning model and keeps misclassification low because, in Equation (2-16), we have $\alpha_k = c\, e_k$ and $\sum_{k=1}^{N} \alpha_k = 0$; therefore the normal vector of $\alpha_i$ must be maximized, or otherwise $\sigma_i$ must be maximized, which this algorithm does.

The third method is information fusion. The new algorithm fuses data at the feature level, using the maximum of the outputs of the SVMs. The maximization model achieves minimum detection time and learning time in the search model (Min O(n)). In the kernel method we analyze the kernel with respect to fault size. The kernel method has four items for analyzing fault size:

a) Orientation of the kernel
b) Number of peaks
c) Size of $c$
d) Variance $\sigma_i$ (deviation of the data with coordinate number $i$)
Selecting $\theta = 45°$, number of peaks = 3, $c = 1$ and $\sigma$ with a Gaussian kernel, the kernel optimization is a minimization.

In summary, this algorithm is well suited to fault diagnosis with multiple inputs.